@fede-bello fede-bello commented Nov 26, 2025

Lazy Loading Support for Pydantic Settings Sources

Summary

This PR implements lazy loading for settings sources, deferring field value resolution until fields are accessed rather than fetching all values during initialization. This enables significant performance improvements for expensive operations such as API calls to cloud secret managers.

Solves #713

What Changed

  • Added a new lazy_load parameter to the GCP Secret Manager settings source (GoogleSecretManagerSettingsSource) to opt into lazy loading.
  • Implemented lazy loading support in the base class, ensuring all providers can benefit with minimal changes.
  • Introduced the internal LazyMapping mechanism for deferring and caching per-field lookups.
  • Updated tests to cover lazy loading behavior and environment-based sources.
    • Performed integration testing specifically for the GCP provider.

Problem

Currently, all settings sources eagerly fetch values for every field during Settings instantiation, even if those fields are never accessed. This is problematic for expensive operations:

  • API calls to cloud secret managers (GCP Secret Manager, AWS Secrets Manager, Azure Key Vault)
  • Large file reads from secrets directories
  • Network roundtrips that could be avoided

Solution: LazyMapping

The implementation introduces a LazyMapping class, a dict-like mapping that:

  • Defers field value resolution until keys are accessed via __getitem__()
  • Caches computed values to avoid redundant operations
  • Implements the Mapping ABC for compatibility with Pydantic's initialization
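
As a rough illustration of the three properties above, here is a minimal standalone sketch. It is not the code from this PR: the class name LazyMappingSketch, the resolver callback, and fetch_secret are hypothetical stand-ins, and only the standard library is used.

```python
from collections.abc import Callable, Iterator, Mapping
from typing import Any


class LazyMappingSketch(Mapping[str, Any]):
    """Read-only mapping that resolves each key on first access and caches it."""

    def __init__(self, keys: list[str], resolver: Callable[[str], Any]) -> None:
        self._keys = tuple(keys)        # known field names; no values fetched yet
        self._resolver = resolver       # hypothetical per-field fetch function
        self._cache: dict[str, Any] = {}

    def __getitem__(self, key: str) -> Any:
        if key not in self._keys:
            raise KeyError(key)
        if key not in self._cache:
            # First access: run the expensive lookup once and remember the result.
            self._cache[key] = self._resolver(key)
        return self._cache[key]

    def __contains__(self, key: object) -> bool:
        # Overridden because Mapping's default calls __getitem__, which would fetch.
        return key in self._keys

    def __iter__(self) -> Iterator[str]:
        return iter(self._keys)

    def __len__(self) -> int:
        return len(self._keys)


calls: list[str] = []


def fetch_secret(name: str) -> str:
    """Stand-in for a slow remote call; records which names were fetched."""
    calls.append(name)
    return f"value-of-{name}"


lazy = LazyMappingSketch(["db_password", "api_key"], fetch_secret)
first = lazy["db_password"]    # triggers one fetch
second = lazy["db_password"]   # served from the cache
```

In this sketch, membership tests and iteration never call the resolver, so only keys that are actually read incur a fetch.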

When lazy_load=True:

  • Settings sources return an empty dict from __call__()
  • A LazyMapping is stored on source._lazy_mapping
  • Field values are only fetched when explicitly accessed
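
The difference between the two modes can be sketched with a toy source. Everything here is an assumption made for illustration: ToySecretSource, resolve(), and fetch_count are hypothetical, a plain dict cache stands in for the LazyMapping, and none of this is the pydantic-settings API.

```python
from typing import Any


class ToySecretSource:
    """Hypothetical stand-in for a settings source with an opt-in lazy mode."""

    def __init__(self, secrets: dict[str, str], lazy_load: bool = False) -> None:
        self._secrets = secrets
        self.lazy_load = lazy_load
        self.fetch_count = 0                  # counts simulated network round trips
        self._cache: dict[str, Any] = {}

    def _fetch(self, name: str) -> str:
        self.fetch_count += 1                 # stands in for one slow remote call
        return self._secrets[name]

    def __call__(self) -> dict[str, Any]:
        if self.lazy_load:
            return {}                         # lazy mode: nothing is fetched here
        return {name: self._fetch(name) for name in self._secrets}

    def resolve(self, name: str) -> Any:
        """Fetch a single field on demand and cache the result."""
        if name not in self._cache:
            self._cache[name] = self._fetch(name)
        return self._cache[name]


secrets = {"a": "1", "b": "2", "c": "3"}

eager = ToySecretSource(secrets)
eager_values = eager()                        # fetches all three values up front

lazy = ToySecretSource(secrets, lazy_load=True)
lazy_values = lazy()                          # returns {} without fetching anything
b = lazy.resolve("b")                         # the first access triggers one fetch
b_again = lazy.resolve("b")                   # the second access hits the cache
```

With three secrets, the eager source pays for three fetches at construction time, while the lazy one pays for a single fetch when "b" is first read.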

Test Coverage

  • Unit tests covering LazyMapping behavior and GCP Secret Manager

Note: Integration tests were performed for GCP Secret Manager. The fix is implemented at the PydanticBaseEnvSettingsSource class level, so all inheriting providers automatically support lazy loading. The parameter was only added for GCP Secret Manager, but extending it to new providers should be as simple as adding the parameter.

Why LazyMapping

Backward Compatibility

lazy_load defaults to False, preserving eager loading behavior.

Alternative Approaches Considered

I don’t think this is the most intuitive implementation, and I initially wanted something simpler. However, I considered several other options and none of them convinced me:

  1. Lazy attribute access on Settings (__getattr__)
    • Idea: Fetch values only when you access them (e.g., settings.db_password)
    • Problem: Requires hacky code that intercepts all field access. Your IDE won't know what fields exist, autocomplete breaks, and it breaks every time Pydantic updates.

  2. Separate LazySettings class
    • Idea: Have two different Settings classes, one eager and one lazy. Pick which to use.
    • Problem: Users have to decide at import time. Can't mix lazy and eager sources together.

  3. Property-based field access
    • Idea: Turn each Settings field into a function/property that fetches on demand
    • Problem: Users would have to change how they define every single field in their code. Your IDE won't understand the types anymore.

  4. Async initialization
    • Idea: Use async def __init__() to fetch values asynchronously
    • Problem: Would break existing code massively. Every Settings instantiation would need await. Too invasive.

@hramezani
Member

hramezani commented Nov 26, 2025

Thanks @fede-bello for the PR.

I think we only want it for GoogleSecretManagerSettingsSource.

I think it doesn't make sense to have lazy loading for the env source or dotenv.

People usually initialize settings on application startup and they usually do it once.

@fede-bello
Author

Do we want it for other cloud providers, such as AWS or Azure? Or just GCP, which is what I was able to test?

@fede-bello fede-bello force-pushed the feat/lazy-load-support branch from 230dc09 to 482b847 Compare November 26, 2025 19:10
@fede-bello fede-bello marked this pull request as draft November 26, 2025 19:13
@fede-bello fede-bello force-pushed the feat/lazy-load-support branch 3 times, most recently from 16e6fa8 to e401bc4 Compare November 26, 2025 19:41
@hramezani
Member

Do we want it for other cloud providers, such as AWS or Azure? Or just GCP, which is what I was able to test?

Let's do it for GCP now because you can test it and probably maintain it later.

@fede-bello fede-bello force-pushed the feat/lazy-load-support branch 3 times, most recently from ced9069 to a4e0d5b Compare November 27, 2025 17:37
@fede-bello fede-bello marked this pull request as ready for review November 27, 2025 17:37
@fede-bello fede-bello changed the title from "Feat/lazy load support" to "Feature: Add lazy load support in GCP" Nov 27, 2025
Comment on lines +2447 to +2448
1. **Initialization**: Settings are created with minimal overhead. Sources return empty dictionaries instead of eagerly fetching all values.

Member

Two questions:

  1. What happens if other sources' values have more priority than GCP settings source?
  2. What happens if the value provided by a source is not a valid value?

Author

  1. Not sure in what sense you mean. If I understand your question correctly, higher-priority sources shadow lower ones. So if there's a key in a higher-priority source, that one will be loaded and the one for GCP won't be consulted
  2. Trying to access will return None. It will enter the get_field_value method in EnvSettingsSource, the field_value will not be found and will return None
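
The shadowing described in point 1 can be sketched with the standard library's ChainMap, which consults its mappings in order and stops at the first hit. This only illustrates the priority idea and is not how pydantic-settings merges sources internally; CountingLazySource and the sample keys are hypothetical.

```python
from collections import ChainMap
from collections.abc import Iterator, Mapping


class CountingLazySource(Mapping[str, str]):
    """Hypothetical low-priority source that records every lookup it serves."""

    def __init__(self, data: dict[str, str]) -> None:
        self._data = dict(data)
        self.lookups: list[str] = []

    def __getitem__(self, key: str) -> str:
        self.lookups.append(key)   # each entry stands for one remote fetch
        return self._data[key]

    def __iter__(self) -> Iterator[str]:
        return iter(self._data)

    def __len__(self) -> int:
        return len(self._data)


env_source = {"db_password": "from-env"}   # higher priority
gcp_source = CountingLazySource({"db_password": "from-gcp", "api_key": "from-gcp"})

# Earlier mappings win; later ones are consulted only when a key is missing.
settings = ChainMap(env_source, gcp_source)
password = settings["db_password"]         # answered by env_source alone
api_key = settings["api_key"]              # falls through to gcp_source
```

Here the low-priority source is consulted only for api_key, because db_password is already answered by the higher-priority mapping.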

PS: I pushed a fix for an issue where the model dump wasn't loading the lazy fields

Member

Trying to access will return None. It will enter the get_field_value method in EnvSettingsSource, the field_value will not be found and will return None

I mean: what if the value is not valid for the field? For example, you defined an int field but the value is a string, or you put a limit on the string length.

@fede-bello fede-bello force-pushed the feat/lazy-load-support branch from a4e0d5b to ec75ebd Compare November 28, 2025 17:11
@fede-bello fede-bello requested a review from hramezani November 28, 2025 17:13
@fede-bello fede-bello force-pushed the feat/lazy-load-support branch from ec75ebd to 3894a89 Compare November 28, 2025 17:39
env_ignore_empty: bool | None = None,
env_parse_none_str: str | None = None,
env_parse_enums: bool | None = None,
lazy_load: bool | None = None,
Member

@hramezani hramezani Nov 28, 2025

Should we keep this here? We agreed to enable lazy loading only for the GCP secret source.

Author

I left it there so that it's eventually easier to implement for other source providers, and so it's not a GCP-only fix. If the parameter is not passed to the base class, the logic shouldn't change.

@hramezani
Member

@fede-bello I think this is going to be complicated. We would need to add and maintain a lot of code, and lazily loading values from GCP is not that important, IMHO.
A pydantic-settings model is usually initialized once at startup as part of the bootstrap process, which generally takes time anyway.

As pydantic-settings lets you write your own custom settings source, and the GCP secret source is not one of our most used sources, it would probably be better not to add lazy loading to the package.

What do you think?

@fede-bello
Author

@hramezani

I don't know if there's a more straightforward way to implement the feature.

The thing is that we are located far from our GCP instances, and each secret fetch can take up to 2-3 seconds. It's not the worst, but it gets annoying when there are a lot of secrets and we're running short tests: a 20-second startup just to try a path that doesn't use a secret.

Maybe we can add a wrapper around the GCP source ourselves and not even have to modify the library, but this seems like an issue that more people might have in the future.
